Stock Market Data for Tesla and Twitter
The dataset used in the following statistical analysis is stock
market data for Elon Musk's publicly traded companies, Tesla
and Twitter, from 01/01/2018 to 05/20/2022. We obtained the data from
the Yahoo Finance API using the R package quantmod. It
contains the daily percentage returns for the Tesla and Twitter
stocks, with 2208 observations on the following 13 variables.
| variable | type | description |
|---|---|---|
| symbol | character | The ticker symbol uniquely identifying a stock |
| date | datetime | The trade day of the recorded observation |
| open | float | Opening value of the stock that day |
| close | float | Closing value of the stock that day |
| high | float | Highest price of the stock on a given trade day |
| low | float | Lowest price of the stock on a given trade day |
| volume | integer | Number of daily shares traded in billions |
| direction | factor | Factor indicating whether the market had a positive or negative return |
| return | decimal | Percentage return for that day |
| lag1 | decimal | Percentage return for previous day |
| lag2 | decimal | Percentage return for 2 days previous |
| lag3 | decimal | Percentage return for 3 days previous |
| lag4 | decimal | Percentage return for 4 days previous |
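
As an illustration of how the derived columns in the table relate to the raw prices, the following is a minimal sketch in base R. It assumes that return is the simple close-to-close percentage change and that direction is "Up" for a non-negative return; the source does not state either definition. The helper add_returns and the toy prices are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: derive return, direction and lagged returns
# from a vector of closing prices for a single stock.
add_returns <- function(close, n_lags = 4) {
  # Assumption: simple close-to-close percentage return; first day is NA
  ret <- c(NA, 100 * diff(close) / head(close, -1))
  out <- data.frame(close = close, return = ret)
  # Assumption: a non-negative return counts as "Up"
  out$direction <- factor(ifelse(ret >= 0, "Up", "Down"),
                          levels = c("Down", "Up"))
  # lagk holds the return observed k trading days earlier
  for (k in seq_len(n_lags)) {
    out[[paste0("lag", k)]] <- c(rep(NA, k), head(ret, -k))
  }
  out
}

# Toy closing prices, made up for illustration only
toy <- add_returns(c(100, 102, 101, 105, 104, 108))
print(toy)
```

With two stocks stacked in one data frame, a helper like this would need to be applied separately for each symbol so that lags never cross from one stock into the other.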
Here, we use the pairs function to create a scatterplot
matrix for every pair of variables in the stock dataset as shown
below.
df.pairs <- df %>% dplyr::select(-alltext)
pairs(stocks.data)
Based on the correlation coefficients and their corresponding
p-values, the daily return is associated with the predictors
volume, lag2, nfav, nretweet, and nreply.
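
One way such coefficients and p-values might be tabulated is with cor.test from R's stats package. The sketch below runs on simulated data, not the stock data, and assumes Pearson correlation, which the text does not specify. The names sim and cor_table are illustrative only.

```r
# Simulated stand-in data; these values are NOT the stock data
set.seed(1)
sim <- data.frame(
  return = rnorm(50),
  volume = rnorm(50),
  lag2   = rnorm(50)
)

# Correlate each predictor with return and collect r and its p-value
predictors <- setdiff(names(sim), "return")
cor_table <- do.call(rbind, lapply(predictors, function(p) {
  ct <- cor.test(sim$return, sim[[p]])  # Pearson by default
  data.frame(predictor = p,
             r = unname(ct$estimate),
             p_value = ct$p.value)
}))
print(cor_table)
```

A predictor would then be flagged as associated with the return when its p-value falls below the chosen significance level, for example 0.05.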